This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
Improve quantization flow #15961
Merged: pengzhao-intel merged 15 commits into apache:master from ZhennanQin:smart_quantize_fast on Aug 29, 2019
Conversation
ZhennanQin requested review from anirudh2290, eric-haibin-lin and szha as code owners on August 21, 2019 at 12:18.
@KellenSunderland the requests from your team :)
ZhennanQin force-pushed the smart_quantize_fast branch 2 times, most recently from ad36eb0 to 98261b4 on August 22, 2019 at 00:51.
ZhennanQin force-pushed the smart_quantize_fast branch from 98261b4 to b02c1a7 on August 22, 2019 at 01:07.
@ZhennanQin @xinyu-intel please rebase the code and retrigger the CI
pengzhao-intel approved these changes on Aug 29, 2019.
It's not easy to pass the CI. Merging now for the customer request.
Description
@pengzhao-intel @TaoLv @xinyu-intel @reminisce @KellenSunderland @anirudh2290
Major changes:
- `calib_layer` is no longer needed. The list of layers that require calibration is now generated by the quantization pass, so the user does not have to specify it. Only the needed layers are calibrated, which speeds up calibration.
- Entropy calibration is refactored: only the histogram of each output is saved, which reduces memory consumption. The entropy calculation is reimplemented as a C++ operator; accuracy is improved, and the entropy method now runs as fast as the naive method.
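The entropy method described above picks a truncation threshold for each output histogram by minimizing KL divergence between the original and quantized distributions. The PR implements this as a C++ operator; the sketch below is only a minimal pure-Python illustration of the general technique, with illustrative function names and bin counts that are not MXNet's actual API:

```python
import math

def kl_divergence(p, q):
    """KL(p || q) over two discrete distributions.

    Bins where p is zero contribute nothing; a bin where p > 0 but
    q == 0 makes the divergence infinite (no smoothing in this sketch).
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return float("inf")
            total += pi * math.log(pi / qi)
    return total

def best_threshold(hist, num_quantized_bins=4):
    """Pick the histogram truncation point (number of kept bins) that
    minimizes KL divergence between the reference distribution and its
    coarsely quantized approximation."""
    best_i, best_kl = len(hist), float("inf")
    for i in range(num_quantized_bins, len(hist) + 1):
        # Reference distribution: first i bins, outliers folded into the last.
        ref = list(hist[:i])
        ref[-1] += sum(hist[i:])
        # Quantized approximation: collapse i bins into num_quantized_bins
        # coarse bins, then expand back, spreading each coarse bin's mass
        # uniformly over its originally non-empty fine bins.
        q = [0.0] * i
        step = i / num_quantized_bins
        for j in range(num_quantized_bins):
            lo, hi = int(j * step), int((j + 1) * step)
            chunk = hist[lo:hi]
            nonzero = sum(1 for c in chunk if c > 0)
            if nonzero == 0:
                continue
            avg = sum(chunk) / nonzero
            for k in range(lo, hi):
                if hist[k] > 0:
                    q[k] = avg
        # Normalize both to probability distributions and compare.
        ref_sum, q_sum = sum(ref), sum(q)
        if ref_sum == 0 or q_sum == 0:
            continue
        kl = kl_divergence([v / ref_sum for v in ref],
                           [v / q_sum for v in q])
        if kl < best_kl:
            best_i, best_kl = i, kl
    return best_i
```

Saving only the histogram (rather than all raw outputs) is what enables the memory reduction mentioned above: the threshold search needs just the bin counts.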
- A new quantization mode `smart` is added, which automatically decides whether each op should be quantized. This mode quantizes only nodes that have a performance benefit (e.g. convolution and FC), plus the necessary supporting nodes. For example, let `A` be a convolution or FC node, which is always quantized; `B` be a ReLU or Add node, which is quantizable, with the quantization flow deciding whether to quantize it; and `C` be a non-quantized node. In `A -> B -> A`, `B` will be quantized because it can pass int8 data downstream. In `C -> B -> C`, `A -> B -> C`, or `C -> B -> A`, `B` will not be quantized.
- Logging is added to the quantization flow, which helps users understand what the flow does and what has changed.
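For a linear chain of ops, the `A`/`B`/`C` decision rule above can be sketched as follows. This is a simplified illustration, not the PR's actual graph pass: the op-name sets and the function name are assumptions, and real graphs are DAGs rather than chains:

```python
# Hypothetical node categories mirroring the A/B/C classification:
# A-nodes are always quantized for performance; B-nodes are quantizable
# but only worth quantizing when they can pass int8 data through.
ALWAYS_QUANTIZE = {"Convolution", "FullyConnected"}   # "A" nodes
OPTIONALLY_QUANTIZE = {"relu", "elemwise_add"}        # "B" nodes

def decide_quantized(ops):
    """Given a linear chain of op names, return a parallel list of
    booleans, True where the op should run in int8.

    An optionally-quantizable op is quantized only when both of its
    neighbours are quantized (the A -> B -> A case); otherwise it stays
    in fp32 (the C->B->C, A->B->C, and C->B->A cases)."""
    quantized = [op in ALWAYS_QUANTIZE for op in ops]
    for i, op in enumerate(ops):
        if op in OPTIONALLY_QUANTIZE:
            prev_q = i > 0 and quantized[i - 1]
            next_q = i + 1 < len(ops) and quantized[i + 1]
            quantized[i] = prev_q and next_q
    return quantized
```

For instance, `decide_quantized(["Convolution", "relu", "Convolution"])` quantizes all three nodes, while `decide_quantized(["Convolution", "relu", "Flatten"])` leaves the ReLU in fp32, matching the `A -> B -> C` case described above.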
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments